7 research outputs found
ZigZag: Universal Sampling-free Uncertainty Estimation Through Two-Step Inference
Whereas the ability of deep networks to produce useful predictions on many
kinds of data has been amply demonstrated, estimating the reliability of these
predictions remains challenging. Sampling approaches such as MC-Dropout and
Deep Ensembles have emerged as the most popular ones for this purpose.
Unfortunately, they require many forward passes at inference time, which slows
them down. Sampling-free approaches can be faster but suffer from other
drawbacks, such as lower reliability of uncertainty estimates, difficulty of
use, and limited applicability to different types of tasks and data.
In this work, we introduce a sampling-free approach that is generic and easy
to deploy, while producing reliable uncertainty estimates on par with
state-of-the-art methods at a significantly lower computational cost. It is
predicated on training the network to produce the same output with and without
additional information about that output. At inference time, when no prior
information is given, we use the network's own prediction as the additional
information. We prove that the difference between the two predictions is an
accurate uncertainty estimate and demonstrate our approach on various types of
tasks and applications.
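The two-step scheme described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `f`, `zero_prior`, and the toy model are placeholder names, and the toy network merely mimics a model that is self-consistent on familiar inputs and inconsistent elsewhere.

```python
import numpy as np

def zigzag_uncertainty(f, x, zero_prior):
    """Two-step, sampling-free uncertainty estimate (ZigZag-style sketch).

    `f(x, prior)` stands for a network trained to predict the same target
    both without prior information (`prior = zero_prior`) and with the
    ground truth supplied as `prior`.
    """
    y1 = f(x, zero_prior)       # first pass: no prior information
    y2 = f(x, y1)               # second pass: feed the prediction back in
    return y1, np.abs(y1 - y2)  # the discrepancy serves as the uncertainty

# Toy stand-in for a trained network: self-consistent on "familiar"
# inputs (|x| <= 2), inconsistent on out-of-distribution ones.
def toy_model(x, prior):
    y = np.tanh(x)
    return y + 0.5 * (prior - y) * (np.abs(x) > 2.0)

pred, unc = zigzag_uncertainty(toy_model, np.array([0.5, 3.0]), np.zeros(2))
# unc is ~0 for the in-distribution input and clearly positive for the other
```

Only a single extra forward pass is needed, which is where the speed advantage over sampling-based methods comes from.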
PartAL: Efficient Partial Active Learning in Multi-Task Visual Settings
Multi-task learning is central to many real-world applications.
Unfortunately, obtaining labelled data for all tasks is time-consuming,
challenging, and expensive. Active Learning (AL) can be used to reduce this
burden. Existing techniques typically involve picking images to be annotated
and providing annotations for all tasks.
In this paper, we show that it is more effective to select not only the
images to be annotated but also a subset of tasks for which to provide
annotations at each AL iteration. Furthermore, the annotations that are
provided can be used to guess pseudo-labels for the tasks that remain
unannotated. We demonstrate the effectiveness of our approach on several
popular multi-task datasets.
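The key idea above, selecting (image, task) pairs rather than whole images, can be sketched as follows. The scoring rule and all names are illustrative assumptions, not the paper's exact acquisition criterion:

```python
import numpy as np

def select_partial_annotations(uncertainty, budget):
    """Pick the (image, task) pairs with the highest uncertainty.

    `uncertainty` is an (n_images, n_tasks) score matrix. Instead of
    annotating every task for each chosen image, only the `budget`
    most uncertain pairs are sent to the annotator; the remaining
    tasks of those images can be filled in with pseudo-labels.
    """
    flat = np.argsort(uncertainty, axis=None)[::-1][:budget]
    return [tuple(np.unravel_index(i, uncertainty.shape)) for i in flat]

scores = np.array([[0.9, 0.1],
                   [0.2, 0.8],
                   [0.3, 0.4]])
pairs = select_partial_annotations(scores, budget=2)
# the two most uncertain pairs: image 0 / task 0 and image 1 / task 1
```

With a fixed annotation budget, spending it on individual (image, task) pairs spreads labels where they are most informative instead of duplicating effort across all tasks of one image.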
Double Refinement Network for Efficient Indoor Monocular Depth Estimation
Monocular depth estimation is the task of obtaining a measure of distance for
each pixel using a single image. It is an important problem in computer vision
and is usually solved using neural networks. Though recent works in this area
have shown significant improvement in accuracy, the state-of-the-art methods
tend to require massive amounts of memory and time to process an image. The
main purpose of this work is to improve the performance of the latest solutions
with no decrease in accuracy. To this end, we introduce the Double Refinement
Network architecture. The proposed method achieves state-of-the-art results on
the standard RGB-D benchmark dataset NYU Depth v2, while its frame rate is
significantly higher (up to an 18-times speedup per image at batch size 1) and
its RAM usage per image is lower.
Masksembles for Uncertainty Estimation
Deep neural networks have amply demonstrated their prowess but estimating the
reliability of their predictions remains challenging. Deep Ensembles are widely
considered as being one of the best methods for generating uncertainty
estimates but are very expensive to train and evaluate. MC-Dropout is another
popular alternative, which is less expensive, but also less reliable. Our
central intuition is that there is a continuous spectrum of ensemble-like
models of which MC-Dropout and Deep Ensembles are extreme examples. The first
uses an effectively infinite number of highly correlated models while the
second relies on a finite number of independent models.
To combine the benefits of both, we introduce Masksembles. Instead of
randomly dropping parts of the network as in MC-Dropout, Masksembles relies on
a fixed number of binary masks, which are parameterized in a way that allows
one to change the correlations between individual models. Namely, by
controlling the overlap between the masks and their density, one can choose
the optimal configuration for the task at hand. This leads to a simple,
easy-to-implement method with performance on par with Deep Ensembles at a
fraction of the cost. We experimentally validate Masksembles on two widely
used datasets, CIFAR10 and ImageNet.
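A minimal sketch of the fixed-mask idea is shown below. This simple random construction is an assumption for illustration, not the paper's exact mask-generation procedure; each mask keeps a `density` fraction of the features, and the overlap between masks is what moves the scheme between MC-Dropout-like (high overlap) and Deep-Ensembles-like (disjoint) behaviour.

```python
import numpy as np

def make_masks(n_masks, n_features, density, seed=0):
    """Generate a fixed set of binary masks (Masksembles-style sketch).

    Each of the `n_masks` masks keeps exactly `density * n_features`
    features. At inference time the input is passed through the network
    once per mask, and the spread of the resulting predictions gives
    the uncertainty estimate.
    """
    rng = np.random.default_rng(seed)
    keep = int(round(density * n_features))
    masks = np.zeros((n_masks, n_features))
    for m in masks:
        # choose which features this ensemble member keeps
        m[rng.choice(n_features, size=keep, replace=False)] = 1.0
    return masks

masks = make_masks(n_masks=4, n_features=8, density=0.5)
# 4 fixed masks, each keeping half of the 8 features
```

Because the masks are fixed rather than resampled, only a small, known number of forward passes is needed, unlike MC-Dropout's open-ended sampling.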
How to Boost Face Recognition with StyleGAN?
State-of-the-art face recognition systems require vast amounts of labeled
training data. Given the priority of privacy in face recognition applications,
the data is limited to celebrity web crawls, which suffer from issues such as a
limited number of identities. On the other hand, the self-supervised revolution
in the industry motivates research on adapting related techniques to facial
recognition. One of the most popular practical tricks is to augment the dataset
with samples drawn from generative models while preserving the identity. We
show that a simple approach based on fine-tuning the pSp encoder for StyleGAN
allows us to improve upon state-of-the-art facial recognition and performs
better than training on synthetic face identities. We also collect
large-scale unlabeled datasets with controllable ethnic constitution --
AfricanFaceSet-5M (5 million images of different people) and AsianFaceSet-3M (3
million images of different people) -- and we show that pretraining on each of
them improves recognition of the respective ethnicities (as well as others),
while combining all unlabeled datasets results in the biggest performance
increase. Our self-supervised strategy is most useful when the amount of
labeled training data is limited, which can be beneficial for more tailored
face recognition tasks and when facing privacy concerns. Evaluation is based on the
standard RFW dataset and a new large-scale RB-WebFace benchmark. The code and
data are made publicly available at
https://github.com/seva100/stylegan-for-facerec.
Comment: 16 pages, 9 figures, 11 tables; accepted to ICCV 202